Meridian's Bob Stuart at the Manhattan launch, showing the law of diminishing returns regarding increasing the sample rate of PCM encoding.
In almost 40 years of attending audio press events, only rarely have I come away feeling that I was present at the birth of a new world. In March 1979, I visited the Philips Research Center in Eindhoven, Holland and heard a prototype of what was to be later called the Compact Disc. In the summer of 1982, I visited Ron Genereux and Bob Berkovitz at Acoustic Research's lab near Boston and heard a very early example of the application of DSP to the correction of room acoustic problems. And in early December, at Meridian's New York offices, I heard Bob Stuart describe the UK company's MQA technology, followed by a demonstration that blew my socks off.
With a pair of Meridian digital active speakers being fed audio data from a laptop, Bob was playing 24-bit files with sample rates up to 192kHz, yet the data rate was not much more than the CD's 1.5Mbps! Not only that, but there was palpability to the sound, a transparency to the original event, that I have almost never heard before, which Jason Victor Serinus can testify to (see below).
MQA is the result of years of research by Bob Stuart and his collaborator, the renowned English engineer Peter Craven. (For more information on MQA, click here.) It involved going back to first principles, examining modern research on the perception of sound, and making a fundamental examination of the nature of music. (Bob presented a paper on this concept to the Audio Engineering Society in Los Angeles last October: "A Hierarchical Approach to Archiving and Distribution," downloadable here. Bob's paper offers much more detail on the theoretical and philosophical basis for MQA than I have room for in this report.) Bob kindly shared his PowerPoint presentation with me, and I reproduce some of the slides in order to describe what MQA does. The following is my interpretation of Bob's presentation; if there are errors, they are my own.
Fig.1 The peak spectrum of a Ravel String Quartet, sampled at 192kHz with 24-bit PCM. On such a diagram, data-rate is equivalent to area
Fig.1 is a variation on what is called a "Shannon Diagram"; it shows the information space required by an audio signal with level shown as the vertical scale in dB, frequency as the horizontal scale in kilohertz. The peak levels of a string quartet recording are plotted against frequency (red trace), while the blue trace shows the level of the recording's background noise. (Bob says that regardless of the type of music, this basic triangular shape in information space is characteristic, music and musical instruments having evolved to match our hearing sensitivity.) The gray shaded area to the left of the graph above the green line shows how much of this information space is preserved with a 16-bit/48kHz PCM recording. Below the green line is shown how much more is preserved with 24-bit PCM.
When the sample rate is doubled to 96kHz, the area shaded in pink is now preserved in the recording; when it is doubled again, to 192kHz, the full information space is captured.
2) That the information present above about 55kHz is noise.
3) That within the baseband (24kHz), the noisefloor is above the 16-bit quantization limit.
4) That the musical information only occupies a fraction of the information space, generally a triangle-shaped area with the greatest dynamic range at low frequencies and the smallest dynamic range at ultrasonic frequencies. This is because of the fundamentally self-similar nature of music with respect to its spectral content.
The last point ties in with the observed fact that every time the sample rate of PCM is doubled, while there can be a slight improvement in sound quality, the data rate and necessary storage space for the file also double. Stereo 16/48k audio has a data rate of 1.54Mbps; 24/96k data 4.6Mbps; and 24/192k data 9.2Mbps. Linear PCM treats every coordinate of the information space as being equally important, hence the inefficiency in terms of capturing and transmitting musical data in PCM form.
Bob Stuart then introduced the concept of "Audio Origami": of folding the ultrasonic-frequency components of music back into the baseband so that the data bandwidth could be reduced.
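The PCM data rates quoted above follow directly from bit depth, sample rate, and channel count. As a minimal sketch (the function name `pcm_data_rate_mbps` is my own, chosen for illustration), the arithmetic looks like this:

```python
def pcm_data_rate_mbps(bit_depth, sample_rate_hz, channels=2):
    """Uncompressed linear-PCM data rate in megabits per second."""
    return bit_depth * sample_rate_hz * channels / 1_000_000


# The three stereo formats discussed in the text.
for bits, rate in [(16, 48_000), (24, 96_000), (24, 192_000)]:
    mbps = pcm_data_rate_mbps(bits, rate)
    print(f"{bits}/{rate // 1000}k stereo: {mbps:.2f}Mbps")
```

Each doubling of the sample rate doubles the result, whether or not the added region of the information space contains any music.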
Fig.2 MQA uses the statistics of music information to encode the 4Fs-octave data within a small area in the information space below the 4Fs Nyquist Frequency.
The first step is shown above. The musical information above 48kHz has a very small dynamic range, so despite the sample rate required being 192kHz, the area it occupies in the information space diagram ("C") is small. MQA encapsulates this data in the small rectangular area around –130dBFS between 24 and 48kHz, an area that with all real recordings would otherwise be random numbers, ie, noise. The result is a 96kHz-sampled file that contains within itself musical data sampled at 192kHz.
Fig.3 The same encapsulation process can be performed on the 2Fs octave data.
But there is no reason to stop there. Fig.3 shows how MQA encapsulates the musical information between 24 and 48kHz sampled at 96kHz ("B") in the baseband information space "A." While the dynamic range of the musical information in this octave is greater than that above 48kHz, the encapsulation uses lossless encoding, resulting in another small rectangular area that can be buried beneath the recording's noisefloor in the baseband below 24kHz.
Fig.4 An MQA 24-bit file sampled at 48kHz contains all the information necessary to recreate a music signal sampled at 192kHz.
Fig.4 shows the final result of this folding and packing: a 24-bit MQA file sampled at 48kHz contains all the musical information corresponding to an original recording sampled at 192kHz. You are not getting something for nothing: The data above the baseband Fs is packed sufficiently beneath the recording's noisefloor, using subtractive dither, in an information space area that would otherwise be random, that it will not have audible consequences. When this file is played back with an MQA decoder, it unfolds to give the original resolution and bandwidth required to play back the music without loss.
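For readers unfamiliar with the term, subtractive dither in its generic textbook form can be sketched as follows. This is not MQA's encoder: the shared seed, the step size, and the function names are all assumptions made for illustration. A known pseudo-random sequence is added before quantization and the same sequence is subtracted on playback.

```python
import random


def quantize(x, step):
    """Round x to the nearest multiple of step."""
    return step * round(x / step)


def dither_sequence(n, step, seed):
    """Pseudo-random dither, uniform over one quantization step.
    Assumption: encoder and decoder regenerate it from a shared seed."""
    rng = random.Random(seed)
    return [rng.uniform(-step / 2, step / 2) for _ in range(n)]


def encode(samples, step, seed):
    """Add the dither, then quantize."""
    dither = dither_sequence(len(samples), step, seed)
    return [quantize(x + d, step) for x, d in zip(samples, dither)]


def decode(coded, step, seed):
    """Subtract the same dither that the encoder added."""
    dither = dither_sequence(len(coded), step, seed)
    return [q - d for q, d in zip(coded, dither)]


step = 1 / 2**15  # one 16-bit step on a +/-1.0 full scale
signal = [0.25 * step * n for n in range(-8, 9)]  # detail finer than one step
restored = decode(encode(signal, step, seed=1), step, seed=1)
worst = max(abs(y - x) for x, y in zip(signal, restored))
print(f"worst-case error: {worst / step:.3f} of a step")
```

After the subtraction, the remaining error stays within half a quantization step and behaves like noise rather than like distortion that tracks the signal.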
Lossless compression of this file gives a further reduction in data rate, to what I understood to be about 1.5Mbps per channel for a stereo recording, only slightly more than that required for uncompressed CD audio and about twice that required for transmission of a 16/44.1k FLAC or ALAC file (my guesstimate).
Backward Compatibility
If you look again at fig.4, above, you can see the green line that defines the noisefloor of the 16-bit PCM format. All the extra encapsulated information lies well below that limit. So if the file is truncated at the 16th bit, you have the equivalent of a normal baseband digital recording, sampled at 48kHz in this example. To a DAC or player that doesn't have MQA decoding, the MQA file will play as a normal 16-bit file. This means that record companies will not have to offer multiple file formats. A single inventory will serve both the general public and audiophiles. Atlantic chairman Craig Kallman was at the Manhattan event, and when I talked with him, he was very enthusiastic about this aspect of MQA. There is an analogy here with the LP record: one inventory that can be played on mass-market players, yet responds with increased quality for those who invest in better players. And MQA intends to guarantee the provenance of the high-resolution masters: no more upsampled CD masters masquerading as "hi-rez."
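The truncation argument can be illustrated with a deliberately simplified toy model. It is not MQA's actual packing scheme, and the function names are hypothetical: the 16 most significant bits of a 24-bit word carry the ordinary audio sample, and the 8 bits below them carry extra data that a legacy player simply discards.

```python
def embed(audio16, payload8):
    """Toy packing: a signed 16-bit sample in the top bits of a 24-bit
    word, an 8-bit payload in the bottom bits (not MQA's real scheme)."""
    if not -32768 <= audio16 <= 32767:
        raise ValueError("audio sample must fit in signed 16 bits")
    if not 0 <= payload8 <= 255:
        raise ValueError("payload must fit in 8 bits")
    return (audio16 << 8) | payload8


def truncate_to_16(word24):
    """What a player without a decoder does: keep only the top 16 bits."""
    return word24 >> 8


def extract_payload(word24):
    """What a decoder could additionally read from the bottom 8 bits."""
    return word24 & 0xFF


word = embed(-1234, 0xA7)
print(truncate_to_16(word), hex(extract_payload(word)))
```

In this toy the legacy player recovers the 16-bit sample exactly, while a decoder that knows where to look also recovers the payload from the same word.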
Returning to the triangle of musical information in fig.1, the old question is: why do we need to preserve and reproduce frequencies above the limit of human hearing, even if we can do it? Bob spent some time discussing this in his presentation, and it comes down to the fact that the ear-brain doesn't just operate as a frequency analyzer. Evolution has fine-tuned the system to be able to detect temporal differences that are equivalent to a bandwidth considerably greater than 20kHz. The anti-aliasing filters in A/D converters and the reconstruction filters in D/A converters introduce temporal smearing that is considerably greater than what our ear-brains are tuned to expect from natural sounds; this smearing is, I believe, responsible for so-called "digital" sound.
The MQA encoder and decoder together have been designed to have a transient response of the same form and order as that of the temporal sensitivity of the ear-brain. And if, at the MQA-encoding stage, the temporal effect of the A/D converter can be compensated for, the complete system offers a transparent window into the original musical event. Meridian describes this as "taking an original master further, toward the original performance, in an analogous way to the processes expert antique picture restorers use to clean the grime and discolored varnish from an Old Master to reveal the original color and vibrancy of the work."
Judging by the recordings I heard in Manhattan, some dating back to the early 1950s, I feel the launch of Meridian's MQA is as important to the quality of sound recording and playback as digital was 40 years ago. I have sent Bob the hi-rez PCM masters for some of my own recordings so I will be able to judge for myself the impact the technology has. In the meantime, Jason Serinus's reaction to a recent dem of MQA in California follows on the next page.















